ABSTRACT

We present an axiomatic construction of hierarchical clustering in 
asymmetric networks where the dissimilarity from node a to node b 
is not necessarily equal to the dissimilarity from node b to node a. 
The theory is built on the axioms of value and transformation, which
encode desirable properties common to any clustering method. Two
hierarchical clustering methods that abide by these axioms are derived:
reciprocal and nonreciprocal clustering. We further show that
any clustering method that satisfies the axioms of value and transformation
lies between reciprocal and nonreciprocal clustering in a
well-defined sense. We apply this theory to the formation of circles
of trust in social networks.

Index Terms — Clustering, asymmetric networks. 

1. INTRODUCTION 

There are literally hundreds of methods, techniques, and heuristics
that can be applied to the determination of hierarchical and nonhierarchical
clusters in finite metric (thus symmetric) spaces; see,
e.g., [1]. Methods to identify clusters in a network of asymmetric
dissimilarities, however, are rarer. A number of approaches reduce
the problem to symmetric clustering by introducing symmetrizations
that are justified by a variety of heuristics; e.g., [2]. An idea that is
more faithful to the asymmetry in the dissimilarity matrix is the adaptation
of spectral clustering [3-5] to asymmetric graphs by using a
random walk perspective to define the clustering algorithm [6] or
through the minimization of a weighted cut [7]. This relative rarity
is expected because the intuition of clusters as groups of nodes that
are closer to each other than to the rest is difficult to generalize when
nodes are close in one direction but far apart in the other.

To overcome this generic difficulty we postulate two particular 
intuitive statements in the form of the axioms of value and transfor- 
mation that have to be satisfied by allowable hierarchical clustering 
methods. The Axiom of Value states that for a network with two 
nodes the nodes are clustered together at the maximum of the two 
dissimilarities between them. The Axiom of Transformation states 
that if we consider a network and reduce all pairwise dissimilari- 
ties, the level at which two nodes become part of the same cluster is 
not larger than the level at which they were clustered together in the 
original network. In this paper we study the space of methods that 
satisfy the axioms of value and transformation. 

Although the theoretical foundations of clustering are not as well
developed as its practice [8-10], the foundations of clustering in
metric spaces have been developed over the past decade [11-14].
Of particular relevance to our work is the case of hierarchical clustering
where, instead of a single partition, we look for a family of
partitions indexed by a resolution parameter; see e.g., [15], [16, Ch.
4], and [17]. In this context, it has been shown in [18] that single
linkage [16, Ch. 4] is the unique hierarchical clustering method
that satisfies symmetric versions of the axioms considered here and
a third axiom stating that no clusters can be formed at resolutions
smaller than the smallest distance in the given data. One may think
of the work presented here as a generalization of [18] to the case of
asymmetric (non-metric) data.

Our first contribution is the derivation of two hierarchical clustering
methods that abide by the axioms of value and transformation.
In reciprocal clustering closeness is propagated through links that are
close in both directions, whereas in nonreciprocal clustering closeness
is allowed to propagate through loops (Section 4). We further
prove that any clustering method that satisfies the value and transformation
axioms lies between reciprocal and nonreciprocal clustering
in a well-defined sense (Section 5).

2. PRELIMINARIES 

Consider a finite set of points $X$ jointly specified with a dissimilarity
function $A_X$ to define the network $N = (X, A_X)$. The set $X$ represents
the nodes in the network. The dissimilarity $A_X(x, x')$ between
nodes $x \in X$ and $x' \in X$ is assumed to be nonnegative for all pairs
$(x, x')$ and null if and only if $x = x'$. However, dissimilarity values
$A_X(x, x')$ need not satisfy the triangle inequality and, more consequential
for the problem considered here, may be asymmetric in that
it is possible to have $A_X(x, x') \neq A_X(x', x)$. We further define $\mathcal{N}$
as the set of all possible networks $N$.

A clustering of the set $X$ is a partition $P_X$ defined as a collection
of sets $P_X = \{B_1, \ldots, B_J\}$ that are nonintersecting, $B_i \cap B_j = \emptyset$
for $i \neq j$, and are required to cover $X$, $\cup_{i=1}^{J} B_i = X$. In this paper
we focus on hierarchical clustering methods whose outcomes are not
single partitions $P_X$ but nested collections $D_X$ of partitions $D_X(\delta)$
indexed by a resolution parameter $\delta \geq 0$. For a given $D_X$, whenever
at resolution $\delta$ nodes $x$ and $x'$ are in the same cluster of $D_X(\delta)$, we
say that they are equivalent at resolution $\delta$ and write $x \sim_{D_X(\delta)} x'$.
The nested collection $D_X$ is termed a dendrogram and is required
to satisfy the following two properties plus an additional technical
property (see [18]):

(D1) Boundary conditions. For $\delta = 0$ the partition $D_X(0)$ clusters
each $x \in X$ into a separate singleton and for some $\delta_0$ sufficiently
large $D_X(\delta_0)$ clusters all elements of $X$ into a single set,

$$D_X(0) = \{\{x\},\ x \in X\}, \qquad D_X(\delta_0) = \{X\} \quad \text{for some } \delta_0.$$

(D2) Resolution. As $\delta$ increases clusters can be combined but not
separated. I.e., for any $\delta_1 < \delta_2$ if we have $x \sim_{D_X(\delta_1)} x'$ for some
pair of points we must have $x \sim_{D_X(\delta_2)} x'$.

As the resolution $\delta$ increases, partitions $D_X(\delta)$ become coarser,
implying that dendrograms are equivalent to trees; see Figs. 1 and 2.

Fig. 1. Axiom of value. Nodes in a two-node network cluster at the
minimum resolution at which both can influence each other.

Denoting by $\mathcal{D}$ the space of all dendrograms we define a hierarchical
clustering method as a function $\mathcal{H} : \mathcal{N} \to \mathcal{D}$ from the space
of networks $\mathcal{N}$ to the space of dendrograms $\mathcal{D}$. For the network
$N_X = (X, A_X)$ we denote by $D_X = \mathcal{H}(X, A_X)$ the output of
clustering method $\mathcal{H}$. When dissimilarities $A_X$ conform to the definition
of a finite metric space, it is possible to show that there exists a
hierarchical clustering method satisfying axioms similar to the ones
proposed in this paper [18]. Furthermore, this method is unique and
corresponds to single linkage. For resolution $\delta$, single linkage makes
$x$ and $x'$ part of the same cluster if they can be linked through a path
of cost not exceeding $\delta$. Formally, equivalence classes at resolution
$\delta$ in the single linkage dendrogram $SL_X$ are defined as

$$x \sim_{SL_X(\delta)} x' \iff \min_{C(x,x')} \ \max_{i \,|\, x_i \in C(x,x')} A_X(x_i, x_{i+1}) \leq \delta. \qquad (1)$$

In (1), $C(x,x')$ denotes a chain between $x$ and $x'$, i.e., an ordered
sequence of nodes connecting $x$ and $x'$. We interpret
$\max_{i \,|\, x_i \in C(x,x')} A_X(x_i, x_{i+1})$ as the maximum dissimilarity cost
we need to pay when traversing the chain $C(x, x')$. The right hand
side of (1) is this maximum cost for the best selection of the chain
$C(x, x')$. Recall that in (1) we are assuming metric data, which in
particular implies $A_X(x_i, x_{i+1}) = A_X(x_{i+1}, x_i)$.
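As an illustrative sketch, not part of the original development, the min-max chain cost in (1) can be computed for all pairs at once with a Floyd-Warshall style recursion. The function name `minimax_chain_cost` and the three-point matrix are assumptions introduced here for illustration only.

```python
def minimax_chain_cost(A):
    """u[i][j] = min over chains from i to j of the max link dissimilarity."""
    n = len(A)
    u = [row[:] for row in A]  # a chain with a single link costs A[i][j]
    for k in range(n):         # allow node k as an intermediate chain node
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

# Hypothetical symmetric dissimilarities on three points.
A = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
SL = minimax_chain_cost(A)
# The chain 0 -> 1 -> 2 costs max(1, 2) = 2, cheaper than the direct link 4.
print(SL[0][2])  # 2
```

Points $x$ and $x'$ are then in the same single linkage cluster at resolution $\delta$ whenever the corresponding entry of `SL` does not exceed $\delta$.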

2.1. Dendrograms as ultrametrics

Dendrograms are convenient graphical representations but otherwise
cumbersome to handle. A mathematically more convenient representation
is to identify dendrograms with finite ultrametric spaces.
An ultrametric defined on the space $X$ is a function $u_X : X \times X \to \mathbb{R}$
that satisfies the strong triangle inequality

$$u_X(x,x') \leq \max\big(u_X(x,x''),\ u_X(x'',x')\big), \qquad (2)$$

on top of the symmetry $u_X(x,x') = u_X(x',x)$, nonnegativity, and
identity properties $u_X(x,x') = 0 \iff x = x'$. Hence, an ultrametric
is a metric that satisfies (2), a stronger version of the triangle
inequality.

As shown in [18], it is possible to establish a bijective mapping
between dendrograms and ultrametrics.

Theorem 1 ([18]) For a given dendrogram $D_X$ define $u_X(x, x')$
as the smallest resolution at which $x$ and $x'$ are clustered together

$$u_X(x,x') := \min\{\delta \geq 0,\ x \sim_{D_X(\delta)} x'\}. \qquad (3)$$

The function $u_X$ is an ultrametric in the space $X$. Conversely, for a
given ultrametric $u_X$ define the relation $\sim_{U_X(\delta)}$ as

$$x \sim_{U_X(\delta)} x' \iff u_X(x,x') \leq \delta. \qquad (4)$$

The relation $\sim_{U_X(\delta)}$ is an equivalence relation and the collection of
partitions of equivalence classes induced by $\sim_{U_X(\delta)}$, i.e., $U_X(\delta) :=
\{X \bmod \sim_{U_X(\delta)}\}$, is a dendrogram.

Fig. 2. Axiom of transformation. If network $N_X$ can be mapped to
network $N_Y$ using a dissimilarity reducing map $\phi$, nodes clustered
together in $D_X(\delta)$ at arbitrary resolution $\delta$ must also be clustered in
$D_Y(\delta)$. For example, $x_1$ and $x_2$ are clustered together at resolution
$\delta'$, therefore $y_1$ and $y_2$ must also be clustered at this resolution.

Given the equivalence between dendrograms and ultrametrics established
by Theorem 1 we can think of hierarchical clustering methods
$\mathcal{H}$ as inducing ultrametrics in the set of nodes $X$ based on dissimilarity
functions $A_X$. The distance $u_X(x, x')$ induced by $\mathcal{H}$ is the
minimum resolution at which $x$ and $x'$ are co-clustered by $\mathcal{H}$.
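The correspondence in Theorem 1 can be made concrete with a small sketch. Assuming an ultrametric is stored as a nested list `u` indexed by node number (a representation chosen here only for illustration), the hypothetical helpers below check the ultrametric properties and recover the partition at a given resolution through the relation in (4).

```python
def is_ultrametric(u):
    """Check symmetry, nonnegativity, identity, and the strong triangle inequality (2)."""
    n = len(u)
    for i in range(n):
        for j in range(n):
            if u[i][j] < 0 or u[i][j] != u[j][i]:
                return False
            if (u[i][j] == 0) != (i == j):
                return False
            for k in range(n):
                if u[i][j] > max(u[i][k], u[k][j]):
                    return False
    return True

def partition_at(u, delta):
    """Equivalence classes of the relation u[i][j] <= delta, as in (4).

    Assumes u is an ultrametric, so that the relation is an equivalence."""
    n = len(u)
    assigned = [False] * n
    blocks = []
    for i in range(n):
        if assigned[i]:
            continue
        block = [j for j in range(n) if u[i][j] <= delta]
        for j in block:
            assigned[j] = True
        blocks.append(block)
    return blocks

# Hypothetical ultrametric on three nodes.
u = [[0, 1, 2],
     [1, 0, 2],
     [2, 2, 0]]
print(is_ultrametric(u))   # True
print(partition_at(u, 0))  # [[0], [1], [2]]
print(partition_at(u, 1))  # [[0, 1], [2]]
print(partition_at(u, 2))  # [[0, 1, 2]]
```

Sweeping `delta` over the distinct values of `u` yields the nested partitions of the dendrogram, with the boundary conditions (D1) visible at the smallest and largest resolutions.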

3. VALUE AND TRANSFORMATION 

To study hierarchical clustering algorithms in the context of asymmetric
networks, we start from two intuitive notions that we translate
into the axioms of value and transformation. The Axiom of Value
is obtained from considering a two-node network with dissimilarities
$\alpha$ and $\beta$; see Fig. 1. In this case, it makes sense for nodes $p$
and $q$ to be in separate clusters at resolutions $\delta < \max(\alpha, \beta)$. For
these resolutions we have either no influence between the nodes, if
$\delta < \min(\alpha, \beta)$, or unilateral influence from one node over the other,
when $\min(\alpha, \beta) \leq \delta < \max(\alpha, \beta)$. In either case both nodes are
different in nature. E.g., if we think of the network as a Markov
chain, nodes $p$ and $q$ form separate classes. We thus require nodes $p$
and $q$ to cluster at resolution $\delta = \max(\alpha, \beta)$. This is somewhat arbitrary,
as any number larger than $\max(\alpha, \beta)$ would work. As a value
claim, however, it means that the clustering resolution parameter $\delta$ is
expressed in the same units as the elements of the dissimilarity matrix.
A formal statement in terms of ultrametric distances follows.

(A1) Axiom of Value. Consider a two-node network $N = (X, A_X)$
with $X = \{p, q\}$, $A_X(p, q) = \alpha$, and $A_X(q, p) = \beta$. The ultrametric
$(X, u_X) = \mathcal{H}(X, A_X)$ produced by $\mathcal{H}$ satisfies

$$u_X(p,q) = \max(\alpha, \beta). \qquad (5)$$

The second restriction on the space of allowable methods $\mathcal{H}$ formalizes
the expected behavior upon a modification of the dissimilarity
matrix; see Fig. 2. Consider networks $N_X = (X, A_X)$ and
$N_Y = (Y, A_Y)$ and denote by $D_X = \mathcal{H}(X, A_X)$ and $D_Y =
\mathcal{H}(Y, A_Y)$ the corresponding dendrogram outputs. If we map all
the nodes of the network $N_X = (X, A_X)$ into nodes of the network
$N_Y = (Y, A_Y)$ in such a way that no pairwise dissimilarity is increased
we expect the network to become more clustered. In terms
of the respective clustering dendrograms we expect that nodes co-clustered
at resolution $\delta$ in $D_X$ are mapped to nodes that are also
co-clustered at this resolution in $D_Y$. The Axiom of Transformation
is a formal statement of this requirement as we introduce next.

Fig. 3. Reciprocal clustering. Nodes $x$ and $x'$ are clustered together
at resolution $\delta$ if they can be joined with a (reciprocal) chain whose
maximum dissimilarity does not exceed $\delta$ in both directions [cf. (7)].
Of all methods that satisfy the axioms of value and transformation,
the reciprocal ultrametric is the largest between any pair of nodes.

(A2) Axiom of Transformation. Consider two networks $N_X =
(X, A_X)$ and $N_Y = (Y, A_Y)$ and a dissimilarity reducing map
$\phi : X \to Y$, i.e., a map $\phi$ such that for all $x, x' \in X$ it holds
$A_X(x,x') \geq A_Y(\phi(x), \phi(x'))$. Then, the output ultrametrics
$(X, u_X) = \mathcal{H}(X, A_X)$ and $(Y, u_Y) = \mathcal{H}(Y, A_Y)$ satisfy

$$u_X(x,x') \geq u_Y(\phi(x), \phi(x')). \qquad (6)$$



A hierarchical clustering method $\mathcal{H}$ is admissible if it satisfies Axioms
(A1) and (A2). Axiom (A1) states that units of the clustering
resolution parameter $\delta$ are the same units of the elements of the dissimilarity
matrix. Axiom (A2) states that if we reduce dissimilarities,
clusters may be combined but cannot be separated.
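To make the two axioms concrete, the following sketch checks them on a pair of two-node networks, where the Axiom of Value alone fixes the output ultrametric. All function names, the matrices, and the map `phi` (stored as a list sending node `i` of $X$ to node `phi[i]` of $Y$) are hypothetical choices made for this illustration.

```python
def value_ultrametric(alpha, beta):
    """Two-node ultrametric prescribed by the Axiom of Value, eq. (5)."""
    m = max(alpha, beta)
    return [[0, m], [m, 0]]

def is_dissimilarity_reducing(A_X, A_Y, phi):
    """True if A_X(x, x') >= A_Y(phi(x), phi(x')) for all pairs."""
    n = len(A_X)
    return all(A_X[i][j] >= A_Y[phi[i]][phi[j]]
               for i in range(n) for j in range(n))

def satisfies_transformation(u_X, u_Y, phi):
    """True if the ultrametrics obey inequality (6) under the map phi."""
    n = len(u_X)
    return all(u_X[i][j] >= u_Y[phi[i]][phi[j]]
               for i in range(n) for j in range(n))

# Hypothetical two-node networks: N_Y is N_X with both dissimilarities reduced.
A_X = [[0, 3], [5, 0]]
A_Y = [[0, 2], [4, 0]]
phi = [0, 1]  # identity map between the two node sets

u_X = value_ultrametric(A_X[0][1], A_X[1][0])  # nodes merge at max(3, 5) = 5
u_Y = value_ultrametric(A_Y[0][1], A_Y[1][0])  # nodes merge at max(2, 4) = 4

print(is_dissimilarity_reducing(A_X, A_Y, phi))  # True
print(satisfies_transformation(u_X, u_Y, phi))   # True
```

In this example reducing the dissimilarities lowers the merging resolution from 5 to 4, which is the behavior that Axiom (A2) requires of any admissible method.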

4. RECIPROCAL AND NONRECIPROCAL CLUSTERING 

An admissible clustering method satisfying axioms (A1)-(A2)
can be constructed by considering the symmetric dissimilarity
$\bar{A}_X(x,x') = \max(A_X(x, x'), A_X(x', x))$, for all $x, x' \in X$.
This effectively reduces the problem to clustering of symmetric
data, a scenario in which single linkage (1) is known to satisfy axioms
similar to (A1)-(A2) [18]. Drawing upon this connection we
define the reciprocal clustering method $\mathcal{H}^R$ with ultrametric outputs
$(X, u_X^R) = \mathcal{H}^R(X, A_X)$ as the one for which the ultrametric
$u_X^R(x, x')$ between nodes $x$ and $x'$ is given by

$$u_X^R(x,x') := \min_{C(x,x')} \ \max_{i \,|\, x_i \in C(x,x')} \bar{A}_X(x_i, x_{i+1}). \qquad (7)$$



An illustration of the definition in (7) is shown in Fig. 3. We search
for chains $C(x, x')$ linking nodes $x$ and $x'$. For a given chain we
walk from $x$ to $x'$ and determine the maximum dissimilarity, in either
the forward or backward direction, across all the links in the
chain. The reciprocal ultrametric $u_X^R(x, x')$ between nodes $x$ and $x'$
is the minimum of this value across all possible chains. Recalling
the equivalence of dendrograms and ultrametrics in Theorem 1 we
know that the dendrogram produced by reciprocal clustering clusters
$x$ and $x'$ together for resolutions $\delta \geq u_X^R(x,x')$. Combining
this latter observation with (7) and denoting by $R_X$ the reciprocal
dendrogram we write the reciprocal equivalence classes as

$$x \sim_{R_X(\delta)} x' \iff \min_{C(x,x')} \ \max_{i \,|\, x_i \in C(x,x')} \bar{A}_X(x_i, x_{i+1}) \leq \delta. \qquad (8)$$

Comparing (8) with the definition in (1), we see that reciprocal clustering
is equivalent to single linkage for the symmetrized network $\bar{N} = (X, \bar{A}_X)$.
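A minimal sketch of reciprocal clustering follows directly from this observation: symmetrize with the maximum and then compute min-max chain costs. The helper names and the three-node example, a directed cycle that is cheap in one direction and expensive in the other, are assumptions made for illustration.

```python
def minimax_chain_cost(A):
    """u[i][j] = min over chains from i to j of the max link dissimilarity."""
    n = len(A)
    u = [row[:] for row in A]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

def reciprocal_ultrametric(A):
    """Reciprocal ultrametric of eq. (7) for an asymmetric dissimilarity matrix A."""
    n = len(A)
    # Symmetrized dissimilarity: the larger of the two directions on each link.
    A_bar = [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]
    return minimax_chain_cost(A_bar)

# Hypothetical asymmetric network: cost 1 along the cycle 0 -> 1 -> 2 -> 0,
# and costs 3, 4, 5 in the opposite direction.
A = [[0, 1, 5],
     [3, 0, 1],
     [1, 4, 0]]
u_R = reciprocal_ultrametric(A)
print(u_R)  # [[0, 3, 4], [3, 0, 4], [4, 4, 0]]
```

Nodes 0 and 2 merge at resolution 4 rather than 5 because the reciprocal chain through node 1 has bidirectional cost max(3, 4) = 4.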

For the method $\mathcal{H}^R$ specified in (7) to be a proper hierarchical
clustering method we need to show that $u_X^R$ is an ultrametric. For
$\mathcal{H}^R$ to be admissible it needs to satisfy axioms (A1)-(A2). Both of
these properties are true as stated in the following proposition.

Proposition 1 The reciprocal clustering method $\mathcal{H}^R$ is valid and
admissible. I.e., $u_X^R$ as defined by (7) is a valid ultrametric and the
method satisfies axioms (A1)-(A2).

Proof: See [19]. ■

Fig. 4. Nonreciprocal clustering. Nodes $x$ and $x'$ are co-clustered
at resolution $\delta$ if they can be joined in both directions with possibly
different (nonreciprocal) chains of maximum dissimilarity not
greater than $\delta$ [cf. (10)]. The nonreciprocal ultrametric is the smallest
among all that abide by the value and transformation axioms.

In reciprocal clustering, nodes $x$ and $x'$ are joined together if we
can go back and forth from $x$ to $x'$ at a maximum cost $\delta$ through
the same chain. In nonreciprocal clustering we relax the restriction
that the chain achieving the minimum cost must be the same in both
directions and cluster nodes $x$ and $x'$ together if there are, possibly
different, chains of maximum cost $\delta$ linking $x$ to $x'$ and $x'$ to $x$.
To state this definition in terms of ultrametrics, consider a given
network $N_X = (X, A_X)$ and define the unidirectional minimum chain cost

$$\tilde{u}_X^{NR}(x,x') := \min_{C(x,x')} \ \max_{i \,|\, x_i \in C(x,x')} A_X(x_i, x_{i+1}). \qquad (9)$$



We define the nonreciprocal clustering method $\mathcal{H}^{NR}$ with ultrametric
outputs $(X, u_X^{NR}) = \mathcal{H}^{NR}(X, A_X)$ as the one for which the ultrametric
$u_X^{NR}(x, x')$ between nodes $x$ and $x'$ is given by the maximum
of the unidirectional minimum chain costs $\tilde{u}_X^{NR}(x,x')$ and
$\tilde{u}_X^{NR}(x', x)$ in each direction,

$$u_X^{NR}(x,x') := \max\big(\tilde{u}_X^{NR}(x,x'),\ \tilde{u}_X^{NR}(x',x)\big). \qquad (10)$$



An illustration of the definition in (10) is shown in Fig. 4. We
consider forward chains $C(x, x')$ going from $x$ to $x'$ and backward
chains $C(x', x)$ going from $x'$ to $x$. For each of these chains we determine
the maximum dissimilarity across all the links in the chain.
We then search independently for the best forward chain $C(x, x')$
and the best backward chain $C(x', x)$ that minimize the respective
maximum dissimilarities across all possible chains. The nonreciprocal
ultrametric $u_X^{NR}(x, x')$ between nodes $x$ and $x'$ is the maximum
of these two minimum values.
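The two-step definition in (9) and (10) can be sketched as follows: first compute directed min-max chain costs on the original asymmetric matrix, then take the larger of the two directions. The helper names and the example network are illustrative assumptions.

```python
def minimax_chain_cost(A):
    """Directed min-max chain cost of eq. (9), for all ordered pairs."""
    n = len(A)
    u = [row[:] for row in A]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

def nonreciprocal_ultrametric(A):
    """Nonreciprocal ultrametric of eq. (10) for an asymmetric matrix A."""
    n = len(A)
    d = minimax_chain_cost(A)  # best forward chains, direction by direction
    # Keep the worse of the best forward and best backward chain costs.
    return [[max(d[i][j], d[j][i]) for j in range(n)] for i in range(n)]

# Hypothetical asymmetric network: cost 1 along the cycle 0 -> 1 -> 2 -> 0,
# and costs 3, 4, 5 in the opposite direction.
A = [[0, 1, 5],
     [3, 0, 1],
     [1, 4, 0]]
u_NR = nonreciprocal_ultrametric(A)
print(u_NR)  # [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

Every pair merges at resolution 1 because each node can reach every other node along the cheap cycle, even though no single link is cheap in both directions.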

As is the case with reciprocal clustering we can verify that $u_X^{NR}$
is a properly defined ultrametric. We also show that $\mathcal{H}^{NR}$ is admissible
in the following proposition.

Proposition 2 The nonreciprocal clustering method $\mathcal{H}^{NR}$ is valid
and admissible. That is, $u_X^{NR}$ as defined by (10) is a valid ultrametric
and the method satisfies axioms (A1)-(A2).

Proof: See [19]. ■

Remark 1 Reciprocal and nonreciprocal clustering are different in
general. However, for symmetric networks, they are equivalent and
coincide with single linkage as defined by (1). To see this, note that
in the symmetric case $\tilde{u}_X^{NR}(x, x') = \tilde{u}_X^{NR}(x', x)$. Therefore, from
(10), $u_X^{NR}(x,x') = \tilde{u}_X^{NR}(x,x')$. Comparing (9) and (7) we get
the equivalence of nonreciprocal and reciprocal clustering by noting
that dissimilarities $\bar{A}_X = A_X$ for the symmetric case. By further
comparison with (1) the equivalence with single linkage follows.



5. EXTREMAL ULTRAMETRICS 

Given that we have constructed two admissible methods satisfying
axioms (A1)-(A2), the question arises of whether these two constructions
are the only possible ones and, if not, whether they are special in
some sense. One can find constructions other than reciprocal
and nonreciprocal clustering that satisfy axioms (A1)-(A2). However,
we prove in this section that reciprocal and nonreciprocal clustering
are a peculiar pair in that all possible admissible clustering
methods are contained between them in a well-defined sense. To
explain this sense properly, observe that since reciprocal chains (see
Fig. 3) are particular cases of nonreciprocal chains (see Fig. 4) we
must have that for all pairs of nodes $x, x'$

$$u_X^{NR}(x,x') \leq u_X^R(x,x'). \qquad (11)$$



I.e., nonreciprocal clustering distances do not exceed reciprocal clustering
distances. An important characterization is that any method
$\mathcal{H}$ satisfying axioms (A1)-(A2) yields ultrametrics that lie between
$u_X^{NR}(x,x')$ and $u_X^R(x,x')$ as we formally state next.

Theorem 2 Consider an admissible clustering method $\mathcal{H}$, that is, a
clustering method satisfying axioms (A1)-(A2). For an arbitrary given
network $N = (X, A_X)$ denote by $(X, u_X) = \mathcal{H}(X, A_X)$ the outcome
of $\mathcal{H}$ applied to $N$. Then, for all pairs of nodes $x, x'$

$$u_X^{NR}(x,x') \leq u_X(x,x') \leq u_X^R(x,x'), \qquad (12)$$

where $u_X^{NR}(x, x')$ and $u_X^R(x, x')$ denote the nonreciprocal and reciprocal
ultrametrics as defined by (10) and (7), respectively.

Proof: See [19]. ■

According to Theorem 2, nonreciprocal clustering applied to the
network $N = (X, A_X)$ yields a uniformly minimal ultrametric among
those that satisfy axioms (A1)-(A2). Reciprocal clustering yields a uniformly
maximal ultrametric. Any other clustering method abiding by (A1)-(A2)
yields an ultrametric such that the distances $u_X(x, x')$ between
any pair of nodes lie between the distances $u_X^{NR}(x,x')$ and
$u_X^R(x,x')$ assigned by nonreciprocal and reciprocal clustering, respectively. In
terms of dendrograms, (12) implies that among all admissible clustering
methods, the smallest possible resolution at which nodes are
clustered together is the one corresponding to nonreciprocal clustering.
The highest possible resolution is the one that corresponds to
reciprocal clustering.
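As a numerical sanity check, and not a substitute for the proof, the ordering between the two extremal ultrametrics can be tested on randomly generated asymmetric networks. The sketch below only compares the nonreciprocal and reciprocal outputs; it does not enumerate other admissible methods. The generator `random_network`, the seed, and the integer range of dissimilarities are assumptions made for illustration.

```python
import random

def minimax_chain_cost(A):
    """u[i][j] = min over chains from i to j of the max link dissimilarity."""
    n = len(A)
    u = [row[:] for row in A]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

def reciprocal_ultrametric(A):
    n = len(A)
    A_bar = [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]
    return minimax_chain_cost(A_bar)

def nonreciprocal_ultrametric(A):
    n = len(A)
    d = minimax_chain_cost(A)
    return [[max(d[i][j], d[j][i]) for j in range(n)] for i in range(n)]

def random_network(n, rng):
    """Hypothetical generator: positive integer dissimilarities, zero diagonal."""
    return [[0 if i == j else rng.randint(1, 20) for j in range(n)]
            for i in range(n)]

rng = random.Random(0)
n = 5
bounds_hold = True
for _ in range(200):
    A = random_network(n, rng)
    u_NR = nonreciprocal_ultrametric(A)
    u_R = reciprocal_ultrametric(A)
    # The nonreciprocal distance must never exceed the reciprocal one.
    bounds_hold = bounds_hold and all(
        u_NR[i][j] <= u_R[i][j] for i in range(n) for j in range(n))
print(bounds_hold)
```

For two-node networks both functions return the value $\max(\alpha, \beta)$ required by Axiom (A1), so the two bounds coincide there.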

Remark 2 From Remark 1 the upper and lower bounds in (12) coincide
with single linkage for symmetric networks. Thus, (12) becomes
an equality in such context. Since metric spaces are particular
cases of symmetric networks, Theorem 2 recovers the uniqueness result
in [18] and extends it to symmetric - but not necessarily metric
- data. Further, the result in [18] is based on three axioms, two of
which are the symmetric particular cases of the axioms of value and
transformation. It then follows that one of the three axioms in [18]
is redundant. See [19] for details.

6. CIRCLES OF TRUST 

We apply the theory developed to the formation of trust clusters -
circles of trust - in social networks [20]. Recalling the equivalence
between dendrograms and ultrametrics, it follows that we can think
of trust propagation in a network as inducing a trust ultrametric $T_X$.
The induced trust distance bound $T_X(x,x') \leq \delta$ signifies that, at
resolution $\delta$, individuals $x$ and $x'$ are part of a circle of trust. Since
axioms (A1)-(A2) are reasonable requirements in the context of trust
networks, Theorem 2 implies that the trust ultrametric must satisfy

$$u_X^{NR}(x,x') \leq T_X(x,x') \leq u_X^R(x,x'), \qquad (13)$$

which is just a reinterpretation of (12). While (13) does not give a
value for trust ultrametrics, nonreciprocal and reciprocal clustering
provide lower and upper bounds on the trust ultrametric, and hence on
the resolution at which circles of trust form.

Fig. 5. Nonreciprocal (left) and reciprocal (right) clustering for an
online social network [21]. Dissimilarities are inversely proportional
to the number of messages sent between any two users. Dendrogram
closeups shown in second row.

As a numerical application consider an online social network of
a community of students at the University of California at Irvine
[21]. In Fig. 5 (top), we depict both clustering algorithms for a subset
of the users of the social network. The dissimilarity between nodes
has been normalized as a function of the messages sent between any
two users where lower distances represent more intense exchange.
Note that although the ultrametrics are lower for the nonreciprocal
case - as they should be by (11) - the overall structure is similar. The
similarity between both dendrograms could be interpreted as an indicator
of symmetry in the communication. Indeed, in a completely
symmetric case both dendrograms would coincide. However, there
is another source of similarity between the two proposed algorithms
which can be interpreted as consistent asymmetry. Examples are
someone who rarely replies to a message regardless of the sender
or someone who sends messages but gets few replies regardless of
the receiver. The similarity between both dendrograms hints that
answering all messages of some people but none of others is not
ubiquitous.

Fig. 5 (bottom) presents a closeup of the dendrograms in Fig. 5
(top) and the major cluster at resolution $\delta = 0.3$ is highlighted in red.
We see that this cluster has a different hierarchical genesis in both
algorithms. I.e., the two clustering methods alter the clustering order
between nodes, which in terms of ultrametrics corresponds to an
inversion of the nearest neighbors ordering. Nonetheless, at $\delta = 0.3$
the red cluster contains the same nodes in both clustering methods.
This implies that, for the given resolution, this cluster constitutes a
circle of trust for any choice of admissible trust metric $T_X$.
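The last observation suggests a simple procedure, sketched below under stated assumptions: at a given resolution, any cluster that appears with the same nodes in both the reciprocal and the nonreciprocal partitions is a circle of trust for every admissible trust ultrametric. The function `certain_circles`, the helper names, and the four-node network are hypothetical and are not taken from the data set in [21].

```python
def minimax_chain_cost(A):
    """u[i][j] = min over chains from i to j of the max link dissimilarity."""
    n = len(A)
    u = [row[:] for row in A]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

def partition_at(u, delta):
    """Clusters of the ultrametric u at resolution delta (sorted node lists)."""
    n = len(u)
    assigned = [False] * n
    blocks = []
    for i in range(n):
        if not assigned[i]:
            block = [j for j in range(n) if u[i][j] <= delta]
            for j in block:
                assigned[j] = True
            blocks.append(block)
    return blocks

def certain_circles(A, delta):
    """Clusters shared by reciprocal and nonreciprocal clustering at delta."""
    n = len(A)
    A_bar = [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]
    u_R = minimax_chain_cost(A_bar)
    d = minimax_chain_cost(A)
    u_NR = [[max(d[i][j], d[j][i]) for j in range(n)] for i in range(n)]
    blocks_NR = partition_at(u_NR, delta)
    return [b for b in partition_at(u_R, delta) if b in blocks_NR]

# Hypothetical four-node trust network with asymmetric dissimilarities.
A = [[0, 1, 2, 9],
     [1, 0, 3, 9],
     [3, 2, 0, 9],
     [9, 9, 9, 0]]
print(certain_circles(A, 2))  # [[3]]
print(certain_circles(A, 3))  # [[0, 1, 2], [3]]
```

At resolution 2 the two methods disagree on whether node 2 has joined nodes 0 and 1, so only the isolated node is certain. At resolution 3 both methods agree on the cluster of nodes 0, 1, and 2, even though it is formed in a different order in each dendrogram.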

7. CONCLUSION 

An axiomatic construction of hierarchical clustering in asymmetric 
networks was presented. Based on two axioms proposed, the axioms 
of value and transformation, two particular clustering methods were 
developed: reciprocal and nonreciprocal clustering. Furthermore, 
these methods were shown to be well-defined extremes of all possi- 
ble clustering methods satisfying the proposed axioms. Finally, the 
theoretical developments were applied to real data in order to study 
the formation of circles of trust in social networks. 